Part 3: Modelling with R
In this third part of the studio, we’ll use R to model Aedes albopictus distribution in Northern Italy. For that, we need to connect to GRASS via the rgrass package by Bivand (2022) in order to read occurrence data and predictors.
rgrass
-
initGRASS(): starts a GRASS GIS session from R -
execGRASS(): executes GRASS GIS commands -
gmeta(): shows GRASS location metadata -
read_VECT()andread_RAST(): read vector and raster maps from GRASS into terra objects. -
write_VECT()andwrite_RAST(): write terra objects into GRASS GIS database
Package terra is developed by Hijmans (2022) and will eventually replace raster.
GRASS GIS and R can be used together in two ways:
A. Using R within a GRASS GIS session, i.e. starting R (or RStudio) from GRASS terminal
- type
Rorrstudio &in the GRASS GIS terminal - load
rgrasslibrary - use
read_VECT(),read_RAST()to read data from GRASS into R - access GRASS GIS modules and database through
execGRASS() - write data (back) to GRASS database with
write_VECT()andwrite_RAST()
B. Using GRASS GIS within an R session, i.e. we connect to GRASS GIS database from within R (or RStudio).
- we need to start GRASS GIS with
initGRASS()from R - we access GRASS GIS modules through
execGRASS() - use
read_VECT(),read_RAST(),write_VECT()andwrite_RAST()to read data from and to GRASS database
Originally intended to apply GRASS functions on data outside GRASS database; hence some prefer to create throw away locations
Let’s move to R
Load packages needed
Option B: Close GRASS, open Rstudio and run:
# path to GRASS binaries (run `grass --config path`)
grassbin <- "/usr/lib64/grass82"
# path to GRASS database
grassdata <- "/home/veroandreo/grass_ncsu_2023/grassdata/"
# path to location
location <- "eu_laea"
# path to mapset
mapset <- "italy_LST_daily"
# start GRASS GIS from R
initGRASS(gisBase = grassbin,
home = tempdir(),
gisDbase = grassdata,
location = location,
mapset = mapset,
override = TRUE,
remove_GISRC= TRUE)Read vector data
Read raster data
# List rasters by pattern
worldclim <- execGRASS("g.list",
parameters = list(type = "raster",
pattern = "worldclim*"))
avg <- execGRASS("g.list",
parameters = list(type = "raster",
pattern = "avg*"))
median <- execGRASS("g.list",
parameters = list(type = "raster",
pattern = "median*",
exclude = "*[1-5]"))
# Concatenate map lists
to_import <- c(attributes(worldclim)$resOut,
attributes(avg)$resOut,
attributes(median)$resOut)
# Read raster layers
predictors <- list()
for (i in to_import){
predictors[i] <- raster(read_RAST(i)) }
# Stack rasters
predictors_r <- raster::stack(predictors)Data preparation
# Variables for models
sp <- "Aedes albopictus"
presence <- st_coordinates(presence)
background <- st_coordinates(background)
env <- predictors_r
# Prepare data: SWD
data_sp <- prepareSWD(species = sp,
p = presence,
a = background,
env = env)
data_sp## Object of class SWD
##
## Species: Aedes albopictus
## Presence locations: 83
## Absence locations: 1000
##
## Variables:
## ---------
## Continuous: worldclim_bio01 worldclim_bio02 worldclim_bio03 worldclim_bio04 worldclim_bio05 worldclim_bio06 worldclim_bio07 worldclim_bio10 worldclim_bio11 avg_autumnal_cooling avg_count_tmean_higher20_lower30 avg_spring_warming median_lower_m2_consec_days
## Categorical: NA
Define relevant variables
Create train and test datasets
## Object of class SWD
##
## Species: Aedes albopictus
## Presence locations: 66
## Absence locations: 1000
##
## Variables:
## ---------
## Continuous: worldclim_bio01 worldclim_bio02 worldclim_bio03 worldclim_bio04 worldclim_bio05 worldclim_bio06 worldclim_bio07 worldclim_bio10 worldclim_bio11 avg_autumnal_cooling avg_count_tmean_higher20_lower30 avg_spring_warming median_lower_m2_consec_days
## Categorical: NA
## Object of class SWD
##
## Species: Aedes albopictus
## Presence locations: 17
## Absence locations: 1000
##
## Variables:
## ---------
## Continuous: worldclim_bio01 worldclim_bio02 worldclim_bio03 worldclim_bio04 worldclim_bio05 worldclim_bio06 worldclim_bio07 worldclim_bio10 worldclim_bio11 avg_autumnal_cooling avg_count_tmean_higher20_lower30 avg_spring_warming median_lower_m2_consec_days
## Categorical: NA
Create folds for cross-validation
Train a default Maxent model with CV
# Train a full model
full_model_sp <- train(method = method,
data = train_sp,
folds = ran_folds)
full_model_sp## Object of class SDMmodelCV
## Method: Maxent
##
## Species: Aedes albopictus
## Replicates: 4
## Presence locations: 66
## Absence locations: 1000
##
## Model configurations:
## --------------------
## fc: lqph
## reg: 1
## iter: 500
##
## Variables:
## ---------
## Continuous: worldclim_bio01 worldclim_bio02 worldclim_bio03 worldclim_bio04 worldclim_bio05 worldclim_bio06 worldclim_bio07 worldclim_bio10 worldclim_bio11 avg_autumnal_cooling avg_count_tmean_higher20_lower30 avg_spring_warming median_lower_m2_consec_days
## Categorical: NA
Remove less important variables
# remove less important variables only if auc does not decrease
reduc_var_sp <- reduceVar(vs_sp,
th = 10,
metric = "auc",
test = TRUE,
permut = perm,
use_jk = TRUE)
reduc_var_sp## Object of class SDMmodelCV
## Method: Maxent
##
## Species: Aedes albopictus
## Replicates: 4
## Presence locations: 66
## Absence locations: 1000
##
## Model configurations:
## --------------------
## fc: lqph
## reg: 1
## iter: 500
##
## Variables:
## ---------
## Continuous: worldclim_bio04 worldclim_bio07 avg_count_tmean_higher20_lower30
## Categorical: NA
We need now to recreate SWD only with the selected variables to run the final model and make predictions.
# Get only relevant variables from the reduced model
retained_varnames <- names(reduc_var_sp@models[[1]]@data@data)
# Subset raster stack
env <- subset(env, retained_varnames)
# SWD with the selected vars
subset_train_sp <- prepareSWD(species = sp,
p = presence,
a = background,
env = env)
c(train_sp, test_sp) %<-%
trainValTest(subset_train_sp,
test = perc_test,
only_presence = TRUE,
seed = seed)Run the best models with the full train data
Make predictions with the final model
Write raster of final model predictions to GRASS
## WARNING: Raster map <Aedes_albopictus_maxent> already exists and will be
## overwritten
## Over-riding projection check
## Importing raster map <Aedes_albopictus_maxent>...
## 0% 3% 6% 9% 12% 15% 18% 21% 24% 27% 30% 33% 36% 39% 42% 45% 48% 51% 54% 57% 60% 63% 66% 69% 72% 75% 78% 81% 84% 87% 90% 93% 96% 99% 100%
Model evaluation
# Threshold dependent evaluation
th_maxent <- thresholds(final_model_sp,
type = "cloglog",
test = test_sp)
knitr::kable(th_maxent, format = 'html', digits = 2)| Threshold | Cloglog value | Fractional predicted area | Training omission rate | Test omission rate | P-values |
|---|---|---|---|---|---|
| Minimum training presence | 0.04 | 0.95 | 0.00 | 0 | 0 |
| Equal training sensitivity and specificity | 0.50 | 0.28 | 0.27 | 0 | 0 |
| Maximum training sensitivity plus specificity | 0.51 | 0.24 | 0.30 | 0 | 0 |
| Equal test sensitivity and specificity | 0.53 | 0.18 | 0.47 | 0 | 0 |
| Maximum test sensitivity plus specificity | 0.53 | 0.17 | 0.47 | 0 | 0 |
Variable importance in final model
## Variable Percent_contribution Permutation_importance
## 1 avg_count_tmean_higher20_lower30 73.4133 69.7786
## 2 worldclim_bio07 21.2599 12.7664
## 3 worldclim_bio04 5.3268 17.4550
Response curves in final model
Disclaimer
This is only a toy example and only the beginning…